AITopics | xgboost and lightgbm

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

Neural Information Processing Systems | Feb-12-2026, 00:13:35 GMT

dataset, sparrow, weak rule, (16 more...)

Neural Information Processing Systems

Country:

Asia > Japan (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Faster Boosting with Smaller Memory

Neural Information Processing Systems | Dec-25-2025, 07:13:13 GMT

State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires enough memory to hold two to three times the training set size. This paper presents an alternative approach to implementing boosted trees that achieves a significant speedup over XGBoost and LightGBM, especially when memory is small. This is achieved using a combination of three techniques: early stopping, effective sample size, and stratified sampling. Our experiments demonstrate a 10-100x speedup over XGBoost when the training data is too large to fit in memory.
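The abstract names effective sample size as one of its three techniques without defining it. As a rough illustration only, the sketch below uses Kish's common definition, n_eff = (sum of weights)^2 / (sum of squared weights). That formula and the function name `effective_sample_size` are assumptions for illustration; the abstract does not confirm that the paper uses exactly this quantity.

```python
def effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2).

    Assumption: this is a standard definition, not necessarily the
    exact quantity used in the paper summarized above.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("weights must not all be zero")
    return total * total / total_sq


# Equal weights: every example counts fully.
uniform = effective_sample_size([1.0, 1.0, 1.0, 1.0])

# One example dominates: the sample behaves like a single example.
skewed = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

With equal weights the value equals the number of examples, and it shrinks as the weights concentrate on a few examples, which is the situation boosting tends to create as training proceeds.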

electronic proceedings, name change, smaller memory, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

3ffebb08d23c609875d7177ee769a3e9-Paper.pdf

Neural Information Processing Systems | Oct-2-2025, 14:53:26 GMT

data mining, machine learning, weak rule, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Diego County (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Enhancing the Product Quality of the Injection Process Using eXplainable Artificial Intelligence

Hong, Jisoo, Hong, Yongmin, Baek, Jung-Woo, Kang, Sung-Woo

arXiv.org Artificial Intelligence | Mar-4-2025

The injection molding process is a traditional technique for making products in various industries, such as electronics and automobiles, by solidifying liquid resin in molds. Although the process is not involved in creating the main parts of engines or semiconductors, this manufacturing methodology sets the final form of the products. Recently, research has continued on reducing the defect rate of the injection molding process. This study proposes an optimal injection molding process control system that reduces the defect rate of injection-molded products using XAI (eXplainable Artificial Intelligence) approaches. Boosting algorithms (XGBoost and LightGBM) are used as tree-based classifiers to predict whether each product is normal or defective. The main features for controlling the process to improve the product are extracted with SHapley Additive exPlanations, while individual conditional expectation is used to analyze the optimal control range of these extracted features. To validate the methodology, the injection molding AI manufacturing dataset provided by KAMP (Korea AI Manufacturing Platform) is used as a case study. The results show that the defect rate decreases from 1.00% (the original defect rate) to 0.21% with XGBoost and 0.13% with LightGBM.

injection molding process, main feature, molding process, (11 more...)

arXiv.org Artificial Intelligence

2503.02338

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Poland (0.04)
Asia > Taiwan (0.04)
(9 more...)

Genre: Research Report (0.83)

Industry:

Transportation (0.55)
Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.86)

Biomarker based Cancer Classification using an Ensemble with Pre-trained Models

Lee, Chongmin, Kim, Jihie

arXiv.org Machine Learning | Jun-14-2024

Certain cancer types, such as pancreatic cancer, are difficult to detect at an early stage, which underscores the importance of discovering the causal relationship between biomarkers and cancer in order to identify cancer efficiently. By allowing specific biomarkers to be detected and monitored non-invasively, liquid biopsies enhance the precision and efficacy of medical interventions and support the move towards personalized healthcare. Several machine learning algorithms, such as Random Forest and SVM, are used for classification, but they are inefficient because they require hyperparameter tuning. We leverage a meta-trained Hyperfast model for classifying cancer, achieving the highest AUC of 0.9929 and greater robustness than other ML algorithms, especially on highly imbalanced datasets, in several binary classification tasks (e.g. breast invasive carcinoma; BRCA vs. non-BRCA). We also propose a novel ensemble model combining a pre-trained Hyperfast model, XGBoost, and LightGBM for multi-class classification tasks, achieving an incremental increase in accuracy (0.9464) while using only 500 PCA features, in contrast to previous studies that used more than 2,000 features for similar results.
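The abstract does not say how the three models are combined. One common way to ensemble classifiers is soft voting, which averages the per-class probabilities from each model and picks the class with the highest average. The sketch below shows that generic technique only; it is an assumption for illustration, not the authors' method, and `soft_vote` and the probability values are hypothetical.

```python
def soft_vote(prob_rows):
    """Average per-class probabilities from several models for one sample.

    prob_rows: one probability list per model, all over the same classes.
    Returns (predicted_class_index, averaged_probabilities).
    Assumption: generic soft voting, not the paper's confirmed scheme.
    """
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    averaged = [
        sum(row[c] for row in prob_rows) / n_models
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Hypothetical outputs of three models for one sample over three classes.
model_outputs = [
    [0.6, 0.4, 0.0],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
predicted, averaged = soft_vote(model_outputs)
```

Here the first model alone would pick class 0, but the averaged probabilities favor class 1, which shows how an ensemble can overrule a single model's vote.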

cancer, classification, ensemble model, (13 more...)

arXiv.org Machine Learning

2406.10087

Country:

North America > Montserrat (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.49)
Health & Medicine > Therapeutic Area > Oncology > Pancreatic Cancer (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)

XGBoost vs LightGBM on a High Dimensional Dataset

#artificialintelligence | Sep-29-2020, 15:01:23 GMT

I recently completed a multi-class classification problem given as a take-home assignment for a data scientist position. It was a good opportunity to compare two state-of-the-art implementations of gradient boosting decision trees: XGBoost and LightGBM. Both algorithms are powerful enough to rank among the best-performing machine learning models. The dataset contains over 60 thousand observations and 103 numerical features. The target variable contains 9 different classes.

artificial intelligence, lightgbm, machine learning, (3 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Faster Boosting with Smaller Memory

Alafate, Julaiti, Freund, Yoav S.

Neural Information Processing Systems | Mar-19-2020, 01:16:44 GMT

State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires enough memory to hold two to three times the training set size. This paper presents an alternative approach to implementing boosted trees that achieves a significant speedup over XGBoost and LightGBM, especially when memory is small. This is achieved using a combination of three techniques: early stopping, effective sample size, and stratified sampling. Our experiments demonstrate a 10-100x speedup over XGBoost when the training data is too large to fit in memory. Papers published at the Neural Information Processing Systems Conference.

memory size, smaller memory, xgboost and lightgbm, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Malware Classification using Machine Learning

#artificialintelligence | Oct-13-2019, 21:08:18 GMT

If you love exploring large and challenging data sets, you should probably give Microsoft Malware Classification a try. Before diving deep into the problem, let's go over a few points on what you can expect to learn from this: In the past few years, the malware industry has grown so rapidly that syndicates invest heavily in technologies to evade traditional protection, forcing anti-malware groups and communities to build more robust software to detect and terminate these attacks. A major part of protecting a computer system from a malware attack is identifying whether a given file or piece of software is malware. We can map the business problem to a multi-class classification problem, where we need to predict the class of each given byte file among nine categories (Ramnit, Lollipop, Kelihos_ver3, Vundo, Simda, Tracur, Kelihos_ver1, Obfuscator.ACY, Gatak). Constraints: we need to provide class probabilities, wrongly classified class labels should be penalized (which is why log loss was chosen as the KPI), and there should be some latency bound.
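To make the choice of log loss concrete, here is a minimal sketch of multi-class log loss: the average of the negative log of the probability assigned to each sample's true class. The function name `multiclass_log_loss`, the clipping constant `eps`, and the sample values are assumptions for illustration and do not come from the article.

```python
import math


def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    true_labels: class indices, one per sample.
    predicted_probs: one probability list per sample.
    eps: clipping bound (assumed value) that keeps log() finite.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)


# A model that spreads probability evenly over the nine malware classes.
uniform_row = [1.0 / 9.0] * 9
baseline = multiclass_log_loss([0, 3], [uniform_row, uniform_row])

# Confident and correct versus confident and wrong, for true class 0.
confident_right = multiclass_log_loss([0], [[0.9] + [0.0125] * 8])
confident_wrong = multiclass_log_loss([0], [[0.0125] * 8 + [0.9]])
```

A uniform guess over nine classes scores ln(9), so any useful model should land below that baseline, while a confident wrong prediction is penalized far more heavily than a confident correct one is rewarded.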

asm file, byte file, malware classification, (14 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (1.00)

Technology: